Local Cluster Cardinality Estimation for Adaptive Mean Shift
This article presents an adaptive mean shift algorithm designed for datasets with varying local scale and cluster cardinality. Local distance distributions, from a point to all others, are used to estimate the cardinality of the local cluster by identifying a local minimum in the density of the distance distribution. Based on these cardinality estimates, local cluster parameters are then computed for the entire cluster, in contrast to KDE-based methods, which provide insight only into localized regions of the cluster. During mean shift execution, the cluster cardinality estimate is used to adaptively adjust the bandwidth and the mean shift kernel radius threshold. Our algorithm outperformed a recently proposed adaptive mean shift method on its original dataset and demonstrated competitive performance on a broader clustering benchmark.
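The bandwidth adaptation described above can be illustrated with a minimal sketch: standard Gaussian mean shift, except that every data point carries its own bandwidth. The constant per-point bandwidths below are a hypothetical stand-in for the paper's cardinality-driven estimates; the estimation procedure itself is not reproduced here.

```python
import numpy as np

def adaptive_mean_shift(X, bandwidths, n_iter=50, tol=1e-6):
    """Gaussian mean shift where each data point i carries its own
    bandwidth (a stand-in for cardinality-driven local-scale estimates)."""
    Y = X.copy()
    for _ in range(n_iter):
        shifted = np.empty_like(Y)
        for i, y in enumerate(Y):
            d2 = np.sum((X - y) ** 2, axis=1)
            w = np.exp(-0.5 * d2 / bandwidths ** 2)  # one bandwidth per data point
            shifted[i] = w @ X / w.sum()
        done = np.max(np.abs(shifted - Y)) < tol
        Y = shifted
        if done:
            break
    return Y

# Two well-separated 1-D blobs; every point gets the same illustrative bandwidth.
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(0.0, 0.1, 20), rng.normal(5.0, 0.1, 20)])[:, None]
modes = adaptive_mean_shift(X, bandwidths=np.full(len(X), 0.3))
```

Points from each blob collapse onto that blob's mode; replacing the constant bandwidths with genuine per-point local-scale estimates is where the paper's cardinality estimation would plug in.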
Weighted quantization using MMD: From mean field to mean shift via gradient flows
Belhadji, Ayoub, Sharp, Daniel, Marzouk, Youssef
Approximating a probability distribution using a set of particles is a fundamental problem in machine learning and statistics, with applications including clustering and quantization. Formally, we seek a finite weighted mixture of Dirac measures that best approximates the target distribution. While much existing work relies on the Wasserstein distance to quantify approximation errors, maximum mean discrepancy (MMD) has received comparatively less attention, especially when allowing for variable particle weights. We study the quantization problem from the perspective of minimizing MMD via gradient flow in the Wasserstein-Fisher-Rao (WFR) geometry. This gradient flow yields an ODE system from which we further derive a fixed-point algorithm called mean shift interacting particles (MSIP). We show that MSIP extends the (non-interacting) mean shift algorithm, widely used for identifying modes in kernel density estimates. Moreover, we show that MSIP can be interpreted as preconditioned gradient descent, and that it acts as a relaxation of Lloyd's algorithm for clustering. Our numerical experiments demonstrate that MSIP and the WFR ODEs outperform other algorithms for quantization of multi-modal and high-dimensional targets.
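The objective behind this line of work, the squared maximum mean discrepancy between a weighted particle set and a target sample, is standard and easy to compute directly. A small sketch under a Gaussian kernel follows; the kernel choice and toy data are illustrative, and this computes the quantity the WFR gradient flow descends, not the MSIP algorithm itself.

```python
import numpy as np

def mmd2(X, a, Y, b, h=1.0):
    """Squared MMD between weighted particle sets (X, a) and (Y, b)
    under a Gaussian kernel with bandwidth h."""
    def K(A, B):
        d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
        return np.exp(-0.5 * d2 / h ** 2)
    return float(a @ K(X, X) @ a - 2.0 * a @ K(X, Y) @ b + b @ K(Y, Y) @ b)

rng = np.random.default_rng(5)
Y = rng.normal(size=(100, 2))       # target samples, uniform weights
b = np.full(100, 1.0 / 100)
a = np.full(10, 1.0 / 10)
X_near = Y[:10]                     # particles sitting on the target
X_far = X_near + 3.0                # the same particles, translated away
```

Here `mmd2(X_near, a, Y, b)` is far smaller than `mmd2(X_far, a, Y, b)`; MSIP and the WFR ODEs additionally evolve the weights `a`, which this sketch keeps fixed.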
Understanding Mean Shift Clustering (Artificial Intelligence)
Abstract: In this study, a novel method for the construction of a driving cycle based on Mean Shift clustering is proposed to solve the problems existing in the traditional micro-trips method. Firstly, 1701 kinematic segments are obtained by processing and dividing the driving data in real road conditions. Secondly, 12 kinematic parameters are calculated for each segment, and the dimensionality of parameters is reduced through principal component analysis (PCA). Three principal components are chosen to classify all cycles into three types by the Mean Shift algorithm. Finally, according to the principle of minimum deviation, representative micro-trips are selected from each type of cycle to complete the construction of the final driving cycle.
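The PCA step in this pipeline is standard and can be sketched with plain NumPy: standardize the 12 kinematic parameters, then project onto the top 3 principal components via SVD. The random matrix below is only a stand-in for the paper's 1701 segments, and the Mean Shift classification step is omitted.

```python
import numpy as np

def pca_reduce(Z, k=3):
    """Standardize columns, then project onto the top-k principal
    components obtained from the SVD of the standardized matrix."""
    Zc = (Z - Z.mean(axis=0)) / Z.std(axis=0)
    U, S, Vt = np.linalg.svd(Zc, full_matrices=False)
    return Zc @ Vt[:k].T

# Stand-in for the paper's data: 1701 segments x 12 kinematic parameters.
rng = np.random.default_rng(6)
Z = rng.normal(size=(1701, 12))
P = pca_reduce(Z, k=3)              # inputs to the Mean Shift classification
```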
Best Papers to Read on the Mean Shift Algorithm
Abstract: Two important nonparametric approaches to clustering emerged in the 1970s: clustering by level sets or cluster tree, as proposed by Hartigan, and clustering by gradient lines or gradient flow, as proposed by Fukunaga and Hostetler. In a recent paper, we argue the thesis that these two approaches are fundamentally the same by showing that the gradient flow provides a way to move along the cluster tree. In making a stronger case, we are confronted with the fact that the cluster tree does not define a partition of the entire support of the underlying density, while the gradient flow does.

Abstract: Mean shift is a simple iterative procedure that gradually shifts data points towards the mode, which denotes the highest density of data points in the region. Mean shift algorithms have been effectively used for data denoising, mode seeking, and finding the number of clusters in a dataset in an automated fashion.
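As a point of reference for the gradient-flow view in the abstracts above, the basic iteration fits in a few lines: each mean shift step replaces a query point with a kernel-weighted mean of the data, which moves it uphill on the kernel density estimate. A minimal NumPy sketch; the Gaussian kernel, bandwidth, and toy data are illustrative choices, not taken from either paper.

```python
import numpy as np

def mean_shift_step(y, X, h):
    """One Gaussian mean shift step: the kernel-weighted mean of the data,
    which moves y along the gradient flow of the kernel density estimate."""
    w = np.exp(-0.5 * np.sum((X - y) ** 2, axis=1) / h ** 2)
    return w @ X / w.sum()

rng = np.random.default_rng(1)
X = rng.normal(0.0, 1.0, size=(200, 2))   # one cluster, one density mode
y = np.array([2.5, -2.0])                 # start in the tail
for _ in range(100):
    y = mean_shift_step(y, X, h=1.0)      # y climbs toward the mode near 0
```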
Mode and Ridge Estimation in Euclidean and Directional Product Spaces: A Mean Shift Approach
The set of local modes and the ridge lines estimated from a dataset are important summary characteristics of the data-generating distribution. In this work, we consider estimating the local modes and ridges from point cloud data in a product space with two or more Euclidean/directional metric spaces. Specifically, we generalize the well-known (subspace constrained) mean shift algorithm to the product space setting and illuminate some pitfalls in such generalization. We derive the algorithmic convergence of the proposed method, provide practical guidelines on the implementation, and demonstrate its effectiveness on both simulated and real datasets.
Clustering dynamics on graphs: from spectral clustering to mean shift through Fokker-Planck interpolation
Craig, Katy, Trillos, Nicolás García, Slepčev, Dejan
In this work we build a unifying framework to interpolate between density-driven and geometry-based algorithms for data clustering, and specifically, to connect the mean shift algorithm with spectral clustering at discrete and continuum levels. We seek this connection through the introduction of Fokker-Planck equations on data graphs. Besides introducing new forms of mean shift algorithms on graphs, we provide new theoretical insights on the behavior of the family of diffusion maps in the large sample limit as well as provide new connections between diffusion maps and mean shift dynamics on a fixed graph. Several numerical examples illustrate our theoretical findings and highlight the benefits of interpolating density-driven and geometry-based clustering algorithms.
The EM Perspective of Directional Mean Shift Algorithm
The directional mean shift (DMS) algorithm is a nonparametric method for pursuing local modes of densities defined by kernel density estimators on the unit hypersphere. In this paper, we show that any DMS iteration can be viewed as a generalized Expectation-Maximization (EM) algorithm; in particular, when the von Mises kernel is applied, it becomes an exact EM algorithm. Under the (generalized) EM framework, we provide a new proof for the ascending property of density estimates and demonstrate the global convergence of directional mean shift sequences. Finally, we give a new insight into the linear convergence of the DMS algorithm.
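For orientation, a single directional mean shift step with the von Mises kernel is easy to sketch: weight each spherical data point by exp(κ⟨xᵢ, y⟩), average, and renormalize onto the sphere. In the EM view above, the weighting plays the role of an E-step and the renormalized average an M-step. A toy sketch on the unit circle; κ and the data are illustrative choices.

```python
import numpy as np

def dms_step(y, X, kappa=10.0):
    """One directional mean shift step with the von Mises kernel:
    E-step-like weighting by exp(kappa * <x_i, y>), then an M-step-like
    weighted average renormalized back onto the unit sphere."""
    w = np.exp(kappa * (X @ y))
    v = w @ X
    return v / np.linalg.norm(v)

# Toy data on the unit circle, concentrated around angle pi/2.
rng = np.random.default_rng(2)
ang = rng.normal(np.pi / 2, 0.2, 100)
X = np.column_stack([np.cos(ang), np.sin(ang)])
y = np.array([1.0, 0.0])                  # start a quarter turn from the mode
for _ in range(50):
    y = dms_step(y, X)                    # iterates stay on the sphere
```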
Kernel Smoothing, Mean Shift, and Their Learning Theory with Directional Data
Directional data consist of observations distributed on a (hyper)sphere, and appear in many applied fields, such as astronomy, ecology, and environmental science. This paper studies both statistical and computational problems of kernel smoothing for directional data. We generalize the classical mean shift algorithm to directional data, which allows us to identify local modes of the directional kernel density estimator (KDE). The statistical convergence rates of the directional KDE and its derivatives are derived, and the problem of mode estimation is examined. We also prove the ascending property of our directional mean shift algorithm and investigate a general problem of gradient ascent on the unit hypersphere. To demonstrate the applicability of our proposed algorithm, we evaluate it as a mode clustering method on both simulated and real-world datasets.
Generalized mean shift with triangular kernel profile
Razakarivony, Sébastien, Barrau, Axel
The mean shift algorithm is a popular way to find modes of some probability density functions taking a specific kernel-based shape, used for clustering or visual tracking. Since its introduction, it underwent several practical improvements and generalizations, as well as deep theoretical analysis mainly focused on its convergence properties. In spite of encouraging results, this question has not received a clear general answer yet. In this paper we focus on a specific class of kernels, adapted in particular to the distribution clustering applications which motivated this work. We show that a novel Mean Shift variant adapted to them can be derived and proven to converge after a finite number of iterations. In order to situate this new class of methods in the general picture of Mean Shift theory, we also give a concise overview of existing results in this field.
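A triangular kernel profile gives each neighbor a weight that decays linearly with distance and vanishes outside a radius, which is what makes finite-step convergence arguments tractable. A hedged sketch of one such step; the radius, data, and fixed iteration budget are illustrative, and this does not reproduce the paper's variant or its convergence proof.

```python
import numpy as np

def triangular_step(y, X, r=1.0):
    """One mean shift step with a triangular kernel profile: a neighbor at
    distance d < r gets weight (1 - d / r); everything else gets weight 0."""
    d = np.linalg.norm(X - y, axis=1)
    w = np.maximum(0.0, 1.0 - d / r)
    if w.sum() == 0.0:
        return y                          # no neighbors inside the radius
    return w @ X / w.sum()

rng = np.random.default_rng(3)
X = rng.normal([3.0, 3.0], 0.2, size=(50, 2))
y = np.array([3.5, 3.5])
for _ in range(30):                       # fixed budget here; the paper instead
    y = triangular_step(y, X, r=1.0)      # proves finite-step convergence
```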
Statistical Inference Using Mean Shift Denoising
In this paper, we study how the mean shift algorithm can be used to denoise a dataset. We introduce a new framework to analyze the mean shift algorithm as a denoising approach by viewing the algorithm as an operator on a distribution function. We investigate how the mean shift algorithm changes the distribution and show that data points shifted by the mean shift concentrate around high density regions of the underlying density function. By using the mean shift as a denoising method, we enhance the performance of several clustering techniques, improve the power of two-sample tests, and obtain a new method for anomaly detection.
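The operator view above can be illustrated directly: apply a few mean shift steps to every point in the sample, keeping the kernel weights anchored on the original data, and the shifted sample contracts toward high-density regions. A minimal sketch with a Gaussian kernel; bandwidth, step count, and data are illustrative choices.

```python
import numpy as np

def mean_shift_denoise(X, h=0.5, n_steps=3):
    """Shift every point a few Gaussian mean shift steps; kernel weights
    stay anchored on the ORIGINAL sample X (non-blurring variant)."""
    Y = X.copy()
    for _ in range(n_steps):
        new_Y = np.empty_like(Y)
        for i, y in enumerate(Y):
            w = np.exp(-0.5 * np.sum((X - y) ** 2, axis=1) / h ** 2)
            new_Y[i] = w @ X / w.sum()
        Y = new_Y
    return Y

rng = np.random.default_rng(4)
X = rng.normal(0.0, 1.0, size=(300, 2))
Y = mean_shift_denoise(X)                 # sample contracts toward high density
```

The contraction is what downstream consumers exploit: clustering and two-sample tests run on `Y` instead of `X`.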